DataRobot detects the date and/or time format (<a target="_blank" href="https://docs.python.org/2/library/datetime#strftime-and-strptime-behavior">standard GLIBC strings</a>) for the selected feature. Verify that it is correct. If the format displayed does not accurately represent the date column(s) of your dataset, modify the original dataset to match the detected format and re-upload it.

![](images/otp-detect.png)
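
Before re-uploading, it can help to confirm locally that every value in the date column parses with the detected format. The following is a minimal sketch using Python's standard `datetime.strptime`; the function name `find_mismatches`, the sample values, and the format string are illustrative assumptions, not part of DataRobot.

```python
from datetime import datetime

def find_mismatches(values, fmt):
    """Return the values that do not parse with the given strptime format."""
    mismatches = []
    for value in values:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            mismatches.append(value)
    return mismatches

# Hypothetical sample column and detected format, for illustration only.
sample = ["2016-05-12", "2016-05-13", "13/05/2016"]
print(find_mismatches(sample, "%Y-%m-%d"))  # ['13/05/2016']
```

Any value the function returns would need to be rewritten in the source dataset to match the detected format.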

Configure the backtesting partitions. You can set them from the dropdowns (applies global settings) or by clicking the [bars in the visualization](#change-backtest-partitions) (applies individual settings). Individual settings override global settings. Once you modify settings for an individual backtest, any changes to the global settings are not applied to the edited backtest.

![](images/otp-backtesting.png)

??? info "Date/date range representation"
    DataRobot uses <em>date points</em> to represent dates and date ranges within the data, applying the following principles:

    * All date points adhere to ISO 8601, UTC (e.g., 2016-05-12T12:15:02+00:00), an internationally accepted way to represent dates and times, with some small variation in the duration format. Specifically, there is no support for ISO weeks (e.g., P5W).

    * Models are trained on data between two ISO dates. DataRobot displays these dates as a date range, but inclusion decisions and all key boundaries are expressed as date points. When you specify a date, DataRobot includes start dates and excludes end dates.

    * Once changes are made to formats using the date partitioning column, DataRobot converts all charts, selectors, etc. to this format for the project.
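
The date point principles above (ISO 8601 in UTC, start dates included, end dates excluded) can be illustrated with a short sketch. The helper names `to_date_point` and `in_range` are hypothetical and only mirror the described behavior; they are not DataRobot functions.

```python
from datetime import datetime, timezone

def to_date_point(dt):
    """Format a timezone-aware datetime as an ISO 8601 string in UTC."""
    return dt.astimezone(timezone.utc).isoformat()

def in_range(point, start, end):
    """Half-open check: the start date is included, the end date is excluded."""
    return start <= point < end

start = datetime(2016, 5, 12, 12, 15, 2, tzinfo=timezone.utc)
end = datetime(2016, 6, 12, 12, 15, 2, tzinfo=timezone.utc)

print(to_date_point(start))         # 2016-05-12T12:15:02+00:00
print(in_range(start, start, end))  # True
print(in_range(end, start, end))    # False
```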

## Set backtest partitions globally {: #set-backtest-partitions-globally }

The following table describes global settings:

|  | Selection | Description |
|---|---|---|
| ![](images/icon-1.png) | [Number of backtests](#set-the-number-of-backtests) | Configures the number of backtests for your project, the time-aware equivalent of cross-validation (but based on time periods or durations instead of random rows). |
| ![](images/icon-2.png) | [Validation length](#set-the-validation-length)  | Configures the size of the testing data partition. |
| ![](images/icon-3.png) | [Gap length](#set-the-gap-length) | Configures spaces in time, representing gaps between model training and model deployment.|
| ![](images/icon-4.png) | [Sampling method](#set-rows-or-duration) | Sets whether to use duration or rows as the basis for partitioning, and whether to use random or latest data.|



See the table above for a description of the backtesting section's display elements.

!!! note
    When changing partition year/month/day settings, note that the month and year values rebalance to fit the larger class (for example, 24 months becomes two years) when possible. However, because DataRobot cannot account for leap years or days in a month as it relates to your data, it cannot convert days into the larger container.
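
As a rough illustration of the rebalancing described in the note, the sketch below rolls whole groups of 12 months into years and leaves days untouched. The function `rebalance` is hypothetical, and the handling of a remainder (for example, 30 months) is an assumption rather than documented behavior.

```python
def rebalance(years=0, months=0, days=0):
    """Roll every full 12 months into a year; never convert days."""
    extra_years, remaining_months = divmod(months, 12)
    return years + extra_years, remaining_months, days

print(rebalance(months=24))           # (2, 0, 0)
print(rebalance(months=30, days=45))  # (2, 6, 45)
```

Days stay as entered because the number of days in a month or year depends on the calendar dates involved.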

### Set the number of backtests {: #set-the-number-of-backtests }

You can change the number of [backtests](#understanding-backtests), if desired. The default number of backtests is dependent on the project parameters, but you can configure up to 20. Before setting the number of backtests, use the histogram to validate that the training and validation sets of each fold will have sufficient data to train a model. Requirements are:

* For OTV, backtests require at least 20 rows in each validation and holdout fold and at least 100 rows in each training fold. If you set a number of backtests that results in any of the partitions not meeting those criteria, DataRobot only runs the number of backtests that do meet the minimums (and marks the display with an asterisk).

* For time series, backtests require at least 4 rows in validation and holdout and at least 20 rows in the training fold. If you set a number of backtests that results in any of the partitions not meeting those criteria, the project could fail. See the [time series partitioning reference](ts-customization) for more information.


![](images/otp-backtest-set.png)
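
The row minimums listed above can be expressed as a simple pre-check on your own row counts. This is a sketch only: the thresholds come from the requirements above, while the `MINIMUMS` table, the function `undersized_partitions`, and the input layout are illustrative assumptions.

```python
# Minimum rows per partition, taken from the requirements above.
MINIMUMS = {
    "otv": {"training": 100, "validation": 20, "holdout": 20},
    "time_series": {"training": 20, "validation": 4, "holdout": 4},
}

def undersized_partitions(project_type, backtests):
    """Return (backtest_index, partition, rows) for partitions below the minimum."""
    limits = MINIMUMS[project_type]
    problems = []
    for index, counts in enumerate(backtests):
        for partition, rows in counts.items():
            if rows < limits[partition]:
                problems.append((index, partition, rows))
    return problems

# Hypothetical row counts per backtest, for illustration only.
backtests = [
    {"training": 500, "validation": 40},
    {"training": 90, "validation": 15},
]
print(undersized_partitions("otv", backtests))
# [(1, 'training', 90), (1, 'validation', 15)]
print(undersized_partitions("time_series", backtests))
# []
```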

By default, DataRobot creates a holdout fold for training models in your project. [In some cases](ts-date-time#partition-without-holdout), however, you may want to create a project without a holdout set. To do so, uncheck the **Add Holdout fold** box. If you disable the holdout fold, the holdout score column does not appear on the Leaderboard (and you have no option to unlock holdout). Any tabs that provide an option to switch between Validation and Holdout will not show the Holdout option.

!!! note
    If you build a project with a single backtest, the Leaderboard does not display a backtest column.

### Set the validation length {: #set-the-validation-length }

To modify the duration, perhaps because of a warning message, click the dropdown arrow in the **Validation length** box and enter duration specifics. Validation length can also be set by [clicking the bars](#change-backtest-partitions) in the visualization. Note how your modifications change the testing representation:

![](images/otp-val-length-1.png)

### Set the gap length {: #set-the-gap-length }

Optionally, set the [gap](#understanding-gaps) length from the **Gap Length** dropdown. The gap is initially set to zero, meaning that DataRobot does not process a gap in testing. When set, DataRobot excludes the data that falls in the gap from use in training or evaluation of the model. Gap length can also be set by [clicking the bars](#change-backtest-partitions) in the visualization.

![](images/otp-gap-length.png)
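
To see which rows a gap removes, the following sketch labels a date as training, gap, or validation. It assumes that each range includes its start and excludes its end; the function `assign_partition` and the boundary dates are hypothetical.

```python
from datetime import date

def assign_partition(day, training_start, training_end,
                     validation_start, validation_end):
    """Label a date; each range includes its start and excludes its end."""
    if training_start <= day < training_end:
        return "training"
    if training_end <= day < validation_start:
        return "gap"  # excluded from both training and evaluation
    if validation_start <= day < validation_end:
        return "validation"
    return "unused"

# Hypothetical boundaries with a two-week gap.
bounds = dict(
    training_start=date(2020, 1, 1),
    training_end=date(2020, 3, 1),
    validation_start=date(2020, 3, 15),
    validation_end=date(2020, 4, 1),
)
for day in (date(2020, 2, 29), date(2020, 3, 1), date(2020, 3, 15)):
    print(day, assign_partition(day, **bounds))
# 2020-02-29 training
# 2020-03-01 gap
# 2020-03-15 validation
```

With a gap of zero (training end equal to validation start), the `"gap"` branch never matches and every row falls into training or validation.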

### Set rows or duration {: #set-rows-or-duration }

By default, DataRobot ensures that each backtest has the same _duration_, either the default or the values set from the dropdown(s) or via the [bars in the visualization](#change-backtest-partitions). If you want the backtest to use the same number of _rows_, instead of the same length of time, use the **Equal rows per backtest** toggle:

![](images/otp-force-rows.png)
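
The difference between equal duration and equal rows is easiest to see on unevenly distributed data. The sketch below is not how DataRobot builds backtests; it only contrasts windows of equal time span with windows of equal row count, using hypothetical helper names and dates.

```python
from datetime import date, timedelta

def split_by_duration(dates, n):
    """Split into n windows of equal time span; row counts may differ."""
    start = min(dates)
    end = max(dates) + timedelta(days=1)
    width = (end - start) / n
    windows = [[] for _ in range(n)]
    for d in sorted(dates):
        index = min(int((d - start) / width), n - 1)
        windows[index].append(d)
    return windows

def split_by_rows(dates, n):
    """Split into n windows of (nearly) equal row count; time spans may differ."""
    ordered = sorted(dates)
    bounds = [round(i * len(ordered) / n) for i in range(n + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(n)]

# Eight rows over 40 days, concentrated in the second half of the period.
offsets = [0, 10, 20, 30, 32, 34, 36, 39]
dates = [date(2020, 1, 1) + timedelta(days=o) for o in offsets]

print([len(w) for w in split_by_duration(dates, 2)])  # [2, 6]
print([len(w) for w in split_by_rows(dates, 2)])      # [4, 4]
```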

Time series projects also have an option to set row or duration for the training data, used as the basis for feature engineering, in the [training window format](ts-customization#duration-and-row-count) section.

Once you have selected the mechanism/mode for assigning data to backtests, select the sampling method, either **Random** or **Latest**, to select how to assign rows from the dataset.

Setting the sampling method is particularly useful if a dataset is not distributed equally over time. For example, if data is skewed to the most recent date, the results of using 50% of random rows versus 50% of the latest will be quite different. By selecting the data more precisely, you have more control over the data that DataRobot trains on.
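
The effect of the sampling method on skewed data can be sketched as follows. The dataset, the helper names `sample_latest` and `sample_random`, and the fixed seed are assumptions made for illustration; DataRobot's own sampling is not shown here.

```python
import random
from datetime import date, timedelta

def sample_latest(dates, fraction):
    """Keep the most recent fraction of rows."""
    ordered = sorted(dates)
    keep = int(len(ordered) * fraction)
    return ordered[len(ordered) - keep:]

def sample_random(dates, fraction, seed=0):
    """Keep a random fraction of rows, regardless of date."""
    keep = int(len(dates) * fraction)
    return sorted(random.Random(seed).sample(list(dates), keep))

# Skewed data: 10 older rows spread across 2019, 90 rows in January 2020.
older = [date(2019, 1, 1) + timedelta(days=30 * i) for i in range(10)]
recent = [date(2020, 1, 1) + timedelta(days=i % 30) for i in range(90)]
dates = older + recent

latest_half = sample_latest(dates, 0.5)
random_half = sample_random(dates, 0.5)
print("latest sample starts at:", min(latest_half))
print("random sample starts at:", min(random_half))
```

Here the latest 50% contains only rows from January 2020, while a random 50% can also draw from the sparse 2019 rows.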

## Change backtest partitions {: #change-backtest-partitions }

If you don't modify any settings, DataRobot disperses rows to backtests equally. However, you can customize an individual backtest's gap, training, validation, and holdout data by clicking the corresponding bar or the pencil icon (![](images/icon-pencil.png)) in the visualization. Note that:

* You can only set holdout in the Holdout backtest ("backtest 0"); you cannot change the training data size in that backtest.

* If, during the initial partitioning detection, the backtest configuration of the ordering (date/time) feature, series ID, or target results in insufficient rows to cover both validation and holdout, DataRobot automatically disables holdout. If other partitioning settings are changed (validation or gap duration, start/end dates, etc.), holdout is not affected unless manually disabled.


* When **Equal rows per backtest** is checked (which sets the partitions to row-based assignment), only the Training End date is applicable.

* When **Equal rows per backtest** is checked, the dates displayed are informative only (that is, they are approximate) and they include padding that is set by the feature derivation and forecast point windows.

### Edit individual backtests {: #edit-individual-backtests }

Regardless of whether you are setting training, gaps, validation, or holdout, elements of the editing screens function the same. Hover on a data element to display a tooltip that reports specific duration information:

![](images/otp-hover.png)

Click a section (1) to open the tool for modifying the start and/or end dates; click in the box (2) to open the calendar picker.

![](images/otp-reset-partition.png)

Triangle markers provide indicators of corresponding boundaries. The larger blue triangle (![](images/icon-blue-triangle.png)) marks the active boundary&mdash;the boundary that will be modified if you apply a new date in the calendar picker. The smaller orange triangle (![](images/icon-orange-triangle.png)) identifies the other boundary points that can be changed but are not currently selected.

The current duration for training, validation, and gap (if configured) is reported under the date entry box:

![](images/otp-report.png)

Once you have made changes to a data element, DataRobot adds an **EDITED** label to the backtest.

![](images/otp-edit.png)

There is no way to remove the **EDITED** label from a backtest, even if you manually reset the durations back to the original settings. If you want to be able to apply global duration settings across all backtests, [copy the project](manage-projects#project-actions-menu) and restart.

### Modify training and validation {: #modify-training-and-validation }

To modify the duration of the training or validation data for an individual backtest:

1. Click in the backtest to open the calendar picker tool.
2. Click the triangle for the element you want to modify&mdash;options are training start (default), training end/validation start, or validation end.
3. Modify dates as required.

### Modify gaps {: #modify-gaps }

A gap is a period between the end of the training set and the start of the validation set, resulting in data being intentionally ignored during model training. You can set the [gap](#gaps) length globally or for an individual backtest.

To set a gap, add time between training end and validation start. You can do this by ending training sooner, starting validation later, or both.

1. Click the triangle at the end of the training period.

2. Click the **Add Gap** link.

	![](images/otp-add-gap.png)

	DataRobot adds an additional triangle marker. Although they appear next to each other, both the selected (blue) and inactive (orange) triangles represent the same date. They are slightly spaced to make them selectable.

3. Optionally, set the **Training End Date** using the calendar picker. The date you set will be the beginning of the gap period (training end = gap start).

4. Click the orange **Validation Start Date** marker; the marker changes to blue, indicating that it's selected.

5. Optionally, set the Validation Start Date (validation start = gap end).

The gap is represented by a yellow band; hover over the band to view the duration.

### Modify the holdout duration {: #modify-the-holdout-duration }

To modify the holdout length, click in the red (holdout) area of backtest 0, the holdout partition. Click the displayed date in the **Holdout Start Date** to open the calendar picker and set a new date. If you modify the holdout partition and the new size results in potential problems, DataRobot displays a warning icon next to the Holdout fold. Click the warning icon (![](images/icon-warning.png)) to expand the dropdown and reset the duration/date fields.

![](images/otp-holdout-warn.png)

### Lock the duration {: #lock-the-duration }

You may want to make backtest <em>date</em> changes without modifying the duration of the selected element. You can lock duration for training, for validation, or for the combined period. To lock duration, click the triangle at one end of the period. Next, hold the **Shift** key and select the triangle at the other end of the locked duration. DataRobot opens calendar pickers for each element:

![](images/otp-duration-locked.png)

Change the date in either entry. Notice that the other date updates to mirror the duration change you made.
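
Conceptually, a locked duration means that moving one boundary shifts the other by the same amount. A minimal sketch of that arithmetic, with hypothetical function names and dates:

```python
from datetime import date

def move_start_locked(start, end, new_start):
    """Move the start date and shift the end date to keep the duration."""
    duration = end - start
    return new_start, new_start + duration

def move_end_locked(start, end, new_end):
    """Move the end date and shift the start date to keep the duration."""
    duration = end - start
    return new_end - duration, new_end

start, end = date(2020, 1, 1), date(2020, 1, 31)  # 30-day period
print(move_start_locked(start, end, date(2020, 1, 11)))
# (datetime.date(2020, 1, 11), datetime.date(2020, 2, 10))
```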


## Interpret the display {: #interpret-the-display }

The date/time partitioning display represents the training and validation data partitions as well as their respective sizes/durations. Use the visualization to ensure that your models are validating on the area of interest. The chart shows, for each backtest, the specific time period of values for the training, validation, and if applicable, holdout and gap data. Specifically, you can observe, for each backtest, whether the model will be representing an interesting or relevant time period. Will the scores represent a time period you care about? Is there enough data in the backtest to make the score valuable?

![](images/otp-rep.png)

The following table describes elements of the display:


|   Element    |   Description |
|--------------|---------------|
| Observations    | The [binned](lift-chart#lift-chart-binning) distribution of values (i.e., frequency), before downsampling, across the dataset. This is the same information as displayed in the feature’s histogram.  |
| Available Training Data   | The blue color bar indicates the training data available for a given fold. That is, all available data minus the validation or holdout data.  |
| Primary Training Data   | The dashed outline indicates the maximum amount of data you can train on to get scores from all backtest folds. You can later choose any time window for training, but depending on what you select, you may not then get all backtest scores. (This could happen, for example, if you train on data greater than the primary training window.) If you train on data less than or equal to the Primary Training Data value, DataRobot completes all backtest scores. If you train on data greater than this value, DataRobot runs fewer tests and marks the backtest score with an asterisk (\*). This value is dependent on (changed by) the number of configured backtests. |
|  Gap   | A gap between the end of the training set and the start of the validation set, resulting in the data being intentionally ignored during model training.  |
|  Validation   | A set of data indicated by a green bar that is not used for training (because DataRobot selects a different section at each backtest). It is similar to traditional [validation](partitioning), except that it is time based. The validation set starts immediately at the end of the primary training data (or the end of the gap). |
| Holdout (only if **Add Holdout fold** is checked) | The reserved (never seen) portion of data used as a final test of model quality once the model has been trained and validated. When using date/time partitioning, [holdout](data-partitioning) is a duration or row-based portion of the training data instead of a random subset. By default, the holdout data size is the same as the validation data size and always contains the latest data. (Holdout size is user-configurable, however.)  |
| Backtest*x*    | Time- or row-based folds used for training models. The Holdout backtest is known as "backtest 0" and labeled as Holdout in the visualization. For small datasets and for the highest-scoring model from Autopilot, DataRobot runs all backtests. For larger datasets, the first backtest listed is the one DataRobot uses for model building. Its score is reported in the Validation column of the Leaderboard. Subsequent backtests are not run until manually initiated on the Leaderboard. |
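
The relationship between Primary Training Data and the number of backtest scores can be approximated with a small sketch. It assumes that each backtest has a known amount of available training data and that a backtest can be scored only if the requested training duration fits within it. Both the assumption and the function names are illustrative, not DataRobot's implementation.

```python
from datetime import timedelta

def primary_training_duration(available):
    """Longest training duration that fits in every backtest."""
    return min(available)

def scorable_backtests(available, requested):
    """Indexes of backtests with enough available training data."""
    return [i for i, duration in enumerate(available) if requested <= duration]

# Hypothetical available training data per backtest (backtest 0 is the latest).
available = [timedelta(days=365), timedelta(days=275), timedelta(days=185)]

print(primary_training_duration(available).days)            # 185
print(scorable_backtests(available, timedelta(days=185)))   # [0, 1, 2]
print(scorable_backtests(available, timedelta(days=200)))   # [0, 1]
```

Requesting more than the primary training duration leaves at least one backtest without a score, which corresponds to the asterisk described in the table.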



Additionally, the display includes **Target Over Time** and **Observations** histograms. Use these displays to visualize the span of times where models are compared, measured, and assessed&mdash;to identify "regions of interest." For example, the displays help to determine the density of data over time, whether there are gaps in the data, etc.

![](images/otp-graphs.png)

In the displays, the green represents the selection of data that DataRobot is validating the model on. The "All Backtest" score is the average of this region. The gradation marks each backtest and its potential overlap with training data.

Study the **Target Over Time** graph to find interesting regions where there is some data fluctuation. It may be interesting to compare models over these regions. Use the **Observations** chart to determine whether, roughly speaking, the amount of data in a particular backtest is suitable.

Finally, you can click the red, locked holdout section to see where in the data the holdout scores are being measured and whether it is a consistent representation of your dataset.
